    On the Complexity of Rule Discovery from Distributed Data

    This paper analyses the complexity of rule selection for supervised learning in distributed scenarios. The selection of rules is usually guided by a utility measure such as predictive accuracy or weighted relative accuracy. Other examples are support and confidence, known from association rule mining. A common strategy to tackle rule selection from distributed data is to evaluate rules locally on each dataset. While this works well for homogeneously distributed data, this work proves limitations of this strategy if distributions are allowed to deviate. To identify those subsets for which local and global distributions deviate may be regarded as an interesting learning task of its own, explicitly taking the locality of data into account. This task can be shown to be basically as complex as discovering the globally best rules from local data. Based on the theoretical results some guidelines for algorithm design are derived.

    Comparing Knowledge-Based Sampling to Boosting

    Boosting algorithms for classification are based on altering the initial distribution assumed to underlie a given example set. The idea of knowledge-based sampling (KBS) is to sample out prior knowledge and previously discovered patterns so that subsequently applied data mining algorithms automatically focus on novel patterns, without any need to adjust the base algorithm. This sampling strategy anticipates a user's expectation based on a set of constraints how to adjust the distribution. In the classified case KBS is similar to boosting. This article shows that a specific, very simple KBS algorithm is able to boost weak base classifiers. It discusses differences to AdaBoost.M1 and LogitBoost, and it compares performances of these algorithms empirically in terms of predictive accuracy, the area under the ROC curve measure, and squared error.

    Boosting Classifiers for Drifting Concepts

    This paper proposes a boosting-like method to train a classifier ensemble from data streams. It naturally adapts to concept drift and allows the drift to be quantified in terms of its base learners. The algorithm is empirically shown to outperform learning algorithms that ignore concept drift. It performs no worse than advanced adaptive time window and example selection strategies that store all the data and are thus not suited for mining massive streams.

    Lifetime Earnings and Life Expectancy

    We estimate remaining life expectancy at age 65 using a very large sample of male German pensioners. Our analysis is entirely nonparametric. Furthermore, the data enable us to compare life expectancy in eastern and western Germany conditional on a measure of socio-economic status. Our findings show a lower bound of almost fifty percent (six years) on the difference in remaining life expectancy between the lowest and the highest socio-economic group considered. Within groups, we find similar values for East and West. Our analysis contributes to the literature in several aspects. First, Germany is clearly underrepresented in differential mortality studies. Second, we are able to use a novel measure of lifetime earnings as a proxy for socio-economic status that remains valid for retired people. Third, the comparison of eastern and western Germany may provide some interesting insights for transformation countries.

    Differential mortality by lifetime earnings in Germany

    We estimate mortality rates by a measure of socio-economic status in a very large sample of male German pensioners aged 65 or older. Our analysis is entirely nonparametric. Furthermore, the data enable us to compare mortality experiences in eastern and western Germany conditional on socio-economic status. As a simple summary measure, we compute period life expectancies at age 65. Our findings show a lower bound of almost 50 percent (six years) on the difference in life expectancy between the lowest and the highest socio-economic group considered. Within groups, we find similar values for the former GDR and western Germany. Our analysis contributes to the literature in three respects. First, we provide the first population-based differential mortality study for Germany. Second, we use a novel measure of lifetime earnings as a proxy for socio-economic status that remains applicable to retired people. Third, the comparison between eastern and western Germany may provide some interesting insights for transformation countries. Keywords: comparison East and West Germany, lifetime earnings measure, mortality and socio-economic status

    Hill-drop sowing of bush beans opens up the option of 'InRow' hoeing

    In 2017, the Sächsisches Landesamt für Umwelt, Landwirtschaft und Geologie (Saxon State Office for Environment, Agriculture and Geology) in Dresden-Pillnitz investigated hill-drop sowing of bush beans with regard to its potential for 'InRow' hoeing. At a relatively uniform stand density of 36 plants/m², hill-drop variants with 5 or 7 seeds per hill, corresponding to hill spacings of 26.0 and 36.4 cm, yielded at most 10% less than standard single-seed sowing with a seed spacing of 5.2 cm. No impairment of mechanical harvestability was observed with hill-drop sowing. With hill-drop sowing, up to 67% of the row area could be hoed, which was reflected in a corresponding reduction in hand-weeding effort.

    Adaptation of NLP Techniques to Cultural Heritage Research and Documentation

    The WissKI system provides a framework for ontology-based science communication and cultural heritage documentation. In many cases, the documentation consists of semi-structured data records with free text fields. Most references in the texts consist of person and place names, as well as time specifications. We present the WissKI tools for semantic annotation using controlled vocabularies and formal ontologies derived from the CIDOC Conceptual Reference Model (CRM). Current research deals with the annotations as building blocks for event recognition. Finally, we outline how the CRM helps to build bridges between documentation in different scientific disciplines.

    High-fidelity simulation increases obstetric self-assurance and skills in undergraduate medical students

    Objective: Teaching intrapartum care is one of the most challenging tasks in undergraduate medical education. High-fidelity obstetric simulators might support students' learning experience. The specific educational impact of these simulators compared with traditional methods of model-based obstetric teaching has not yet been determined. Study design: We randomly assigned 46 undergraduate medical students to be taught using either a high-fidelity simulator or a scale wood-and-leather phantom. Their self-assessments were evaluated using a validated questionnaire. We assessed obstetric skills and asked students to solve obstetric paper cases. Main outcome measures: Assessment of fidelity-specific teaching impact on procedural knowledge, motivation, and interest in obstetrics as well as obstetric skills using high- and low-fidelity training models. Results: High-fidelity simulation specifically improved students' feeling that they understood both the physiology of parturition and the obstetric procedures. Students in the simulation group also felt better prepared for obstetric house jobs and performed better in obstetric skills evaluations. However, the two groups made equivalent obstetric decisions. Conclusion: This study provides the first data on the impact of high-fidelity simulation in an undergraduate setting.